Understanding JavaScript Internals in Depth (5): Functions
This article is a translation of http://dmitrysoshnikov.com/ecmascript/chapter-5-functions/#introduction
Introduction
In this article we will talk about one of the fundamental ECMAScript objects: functions. In particular, we will go through the various types of functions, define how each type influences the variable object of a context, and see what the scope chain of each function contains. We will answer frequently asked questions such as: is there any difference (and if so, what is it) between functions created as follows:

    var foo = function () { ... };

and functions defined in the "habitual" way:

    function foo() { ... }

Or: why, in the following call, does the function have to be surrounded with parentheses?

    (function () { ... })();

Since this article relies on earlier chapters, for a full understanding of this part it is desirable to read Chapter 2. Variable object and Chapter 4. Scope chain, since we will actively use terminology from those chapters.
But let us take things one at a time. We begin with the function types.
Types of functions
In ECMAScript there are three function types, and each of them has its own features.
Function Declaration
A Function Declaration (abbreviated form: FD) is a function which:
- has an obligatory name;
- is positioned in the source code either at the Program level or directly in the body of another function (FunctionBody);
- is created on entering the context stage;
- influences the variable object;
- and is declared in the following way:

    function exampleFunc() { ... }

The main feature of this type of function is that only they influence the variable object (they are stored in the VO of the context). This feature defines the second important point (which is a consequence of the nature of the variable object): at the code execution stage they are already available, since FD are stored in the VO on entering the context stage, before execution begins.
Example (the function is called before its declaration in the source code):

    foo();

    function foo() {
      alert('foo');
    }
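The difference in creation time can be observed with typeof. The following is a minimal sketch, not part of the original article; the names observe, declared and expressed are hypothetical and chosen only for illustration.

```javascript
// Sketch: a function declaration is usable before its source position,
// while a variable holding a function expression is still undefined there.
// The names `observe`, `declared` and `expressed` are hypothetical.
function observe() {
  var seen = {
    declaredBefore: typeof declared,   // FD already created on entering the context
    expressedBefore: typeof expressed  // the variable exists, the FE is not created yet
  };

  function declared() { return 'FD'; }
  var expressed = function () { return 'FE'; };

  seen.expressedAfter = typeof expressed;
  return seen;
}

var seen = observe();
console.log(seen.declaredBefore);  // "function"
console.log(seen.expressedBefore); // "undefined"
console.log(seen.expressedAfter);  // "function"
```

Using typeof instead of a direct call keeps the sketch from throwing when the value is not yet a function.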
The position at which the function is defined in the source code is also important (see the second bullet of the function declaration definition above):

    // a function can be declared:
    // 1) directly in the global context
    function globalFD() {
      // 2) or inside the body
      // of another function
      function innerFD() {}
    }

These are the only two positions in code where a function may be declared (i.e. it is impossible to declare it in an expression position or inside a code block).
The alternative to function declarations is function expressions, which we are about to cover.
Function Expression
A Function Expression (abbreviated form: FE) is a function which:
- in the source code can only be defined at the expression position;
- can have an optional name;
- its definition has no effect on the variable object;
- and is created at the code execution stage.
The main feature of this type of function is that in the source code it is always in the expression position. Here is a simple example, an assignment expression:

    var foo = function () { ... };
This example shows how an anonymous FE is assigned to the variable foo. After that the function is available via the name foo: foo().
The definition states that this type of function can have an optional name:

    var foo = function _foo() { ... };
What is important to note here is that from the outside the FE is accessible via the variable foo (foo()), while from inside the function (for example, in a recursive call) it is also possible to use the name _foo.
When an FE is assigned a name it can be difficult to distinguish it from an FD. However, if you know the definition, it is easy to tell them apart: an FE is always in the expression position. In the following example we can see various ECMAScript expressions in which all the functions are FE:

    // in parentheses (grouping operator) there can be only an expression
    (function foo() {});

    // in the array initialiser, also only expressions
    [function bar() {}];

    // the comma operator also works with expressions
    1, function baz() {};
The definition also states that an FE is created at the code execution stage and is not stored in the variable object. Let's see an example of this behavior:

    // the FE is available neither before the definition
    // (because it is created at the code execution stage),

    alert(foo); // "foo" is not defined

    (function foo() {});

    // nor after, because it is not in the VO

    alert(foo); // "foo" is not defined
The logical question now is: why do we need this type of function at all? The answer is obvious: to use them in expressions and "not pollute" the variable object. This can be demonstrated by passing a function as an argument to another function:

    function foo(callback) {
      callback();
    }

    foo(function bar() {
      alert('foo.bar');
    });

    foo(function baz() {
      alert('foo.baz');
    });
If an FE is assigned to a variable, the function remains stored in memory and can later be accessed via this variable name (because variables, as we know, influence the VO):

    var foo = function () {
      alert('foo');
    };

    foo();
Another example is the creation of an encapsulated scope to hide auxiliary helper data from the external context (in the following example we use an FE which is called right after creation):

    var foo = {};

    (function initialize() {

      var x = 10;

      foo.bar = function () {
        alert(x);
      };

    })();

    foo.bar(); // 10

    alert(x); // "x" is not defined
We see that the function foo.bar (via its [[Scope]] property) has access to the internal variable x of the function initialize. At the same time, x is not accessible directly from the outside. This strategy is used in many libraries to create "private" data and hide auxiliary entities. Often in this pattern the name of the initializing FE is omitted:

    (function () {

      // initializing scope

    })();
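As an illustration of the "private data" strategy described above, here is a small sketch of a counter whose state lives only in the scope of an immediately invoked FE. The counter object and its method names are assumptions made for this example, not something taken from a particular library.

```javascript
// Sketch of the "private data" pattern: `count` lives only in the scope
// of the immediately invoked FE. The `counter` object and its methods are
// hypothetical names chosen for this illustration.
var counter = {};

(function () {
  var count = 0; // not reachable from the outside

  counter.increment = function () {
    count += 1;
    return count;
  };

  counter.current = function () {
    return count;
  };
})();

counter.increment();
counter.increment();

console.log(counter.current()); // 2
console.log(typeof count);      // "undefined"
```

Both methods keep access to count through their [[Scope]], while outside code can reach it only through them.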
Here is another example of FE, which are created conditionally at runtime and do not pollute the VO:

    var foo = 10;

    var bar = (foo % 2 == 0
      ? function () { alert(0); }
      : function () { alert(1); }
    );

    bar(); // 0
Question "about surrounding parentheses"
Let's go back and answer the question from the beginning of the article: "why is it necessary to surround a function with parentheses if we want to call it right from its definition?" The answer lies in the restrictions of the expression statement.
According to the standard, an expression statement (ExpressionStatement) cannot begin with an opening curly brace { since it would be indistinguishable from a block, and it also cannot begin with the function keyword since then it would be indistinguishable from a function declaration. I.e., if we try to define an immediately invoked function in the following way (starting with the function keyword):

    function () {
      ...
    }();

    // or even with a name

    function foo() {
      ...
    }();
we deal with function declarations, and in both cases a parser will produce a parse error. However, the reasons for these parse errors differ.
If we put such a definition in the global code (i.e. at the Program level), the parser should treat the function as a declaration, since it starts with the function keyword. In the first case we get a SyntaxError because of the absence of the function's name (a function declaration, as we said, must always have a name).
In the second case we do have a name (foo) and the function declaration should be created normally. But it is not, since we have another syntax error there: a grouping operator without an expression inside it. Notice that in this case it is exactly a grouping operator which follows the function declaration, not the parentheses of a function call! So if we had the following source:

    // "foo" is a function declaration
    // and is created on entering the context

    alert(foo); // function

    function foo(x) {
      alert(x);
    }(1); // and this is just a grouping operator, not a call!

    foo(10); // and this is already a call, 10
everything is fine, since here we have two syntactic productions: a function declaration and a grouping operator with an expression (1) inside it. The example above is the same as:

    // function declaration
    function foo(x) {
      alert(x);
    }

    // a grouping operator
    // with the expression
    (1);

    // another grouping operator with
    // another (function) expression
    (function () {});

    // also, an expression inside
    ("foo");

    // etc
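The parse results discussed above can be probed at runtime. This is only a sketch under one assumption: it hands each source text to the Function constructor, which throws a SyntaxError when the text does not parse as a function body. The helper name parses is hypothetical.

```javascript
// Sketch: probe how each form parses by handing the source text to the
// Function constructor, which throws a SyntaxError for invalid bodies.
// The helper name `parses` is hypothetical.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) {
      return false;
    }
    throw e;
  }
}

var results = {
  anonymousDeclaration: parses('function () {}();'),   // declaration without a name
  emptyGrouping: parses('function foo() {}();'),       // "()" has no expression inside
  groupingWithValue: parses('function foo(x) {}(1);'), // FD followed by "(1);"
  wrappedExpression: parses('(function () {})();')     // FE, then a call
};

console.log(results);
```

The first two forms are rejected by the parser, while the last two are accepted; only the last one actually calls the function.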
If we had such a definition inside a statement, then, as we said, because of the ambiguity we should get a syntax error:

    if (true) function foo() { alert(1) }

By the specification the construction above is syntactically incorrect (an expression statement cannot begin with the function keyword), but as we will see below, none of the implementations reports a syntax error; each handles this case in its own manner.
Having all this, how do we tell the parser that what we really want is to call a function immediately after its creation? The answer is obvious: it should be a function expression, not a function declaration. The simplest way to create an expression is to use the grouping operator mentioned above. Inside it there is always an expression. Thus the parser treats the code as a function expression (FE) and there is no ambiguity. Such a function will be created during the execution stage, then executed, and then removed (if there are no references to it).

    (function foo(x) {
      alert(x);
    })(1); // OK, it's a call, not a grouping operator, 1
In the example above the parentheses at the end (the Arguments production) are already a call of the function, not a grouping operator as in the case of an FD.
Notice that in the following example of an immediate invocation of a function the surrounding parentheses are not required, since the function is already in the expression position and the parser knows that it deals with an FE which should be created at the code execution stage:

    var foo = {

      bar: function (x) {
        return x % 2 != 0 ? 'yes' : 'no';
      }(1)

    };

    alert(foo.bar); // 'yes'
As we see, foo.bar is a string, not a function as it may seem at first inattentive glance. The function here is used only for the initialization of the property, depending on the conditional parameter: it is created and called right after that.
Therefore, the complete answer to the question "about parentheses" is the following:
Grouping parentheses are needed when a function is not at the expression position and we want to call it immediately after its creation; in this case we just manually transform the function into an FE.
When the parser knows that it deals with an FE, i.e. the function is already at the expression position, the parentheses are not required.
Apart from surrounding parentheses it is possible to use any other way of transforming a function into an FE. For example:

    1, function () {
      alert('anonymous function is called');
    }();

    // or this one
    !function () {
      alert('ECMAScript');
    }();

    // and any other manual
    // transformation

    ...
However, grouping parentheses are the most widespread and elegant way to do it.
By the way, the grouping operator can surround the function description either without the call parentheses or including them. I.e. both expressions below are correct FE:

    (function () {})();
    (function () {}());
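The following sketch puts the variants side by side and records what each of them produces. All variable names are hypothetical; the point is only that every form places the function in the expression position before calling it.

```javascript
// Sketch: several ways of putting a function into the expression position
// and calling it at once. The variable names are hypothetical.
var log = [];

var a = (function () { return 'call outside the grouping'; })();
var b = (function () { return 'call inside the grouping'; }());

// A unary operator also makes the parser expect an expression. The
// operator is applied to the return value, which is usually discarded.
!function () { log.push('logical not'); }();
void function () { log.push('void'); }();

// Note that the operator changes the resulting value:
var negated = !function () { return true; }();

console.log(a, b, log, negated);
```

When the return value matters, the grouping operator is the safer choice, because unary operators transform the result (here true becomes false).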
Implementations extension: Function Statement
The following example shows code which none of the implementations processes according to the specification:

    if (true) {

      function foo() {
        alert(0);
      }

    } else {

      function foo() {
        alert(1);
      }

    }

    foo(); // 1 or 0? test in different implementations
Here it is necessary to say that according to the standard this syntactic construction is in general incorrect, because, as we remember, a function declaration (FD) cannot appear inside a code block (here if and else contain code blocks). As has been said, an FD can appear only in two places: at the Program level or directly inside the body of another function.
The above example is incorrect because a code block can contain only statements. And the only place in which a function can appear within a block is one of such statements: the expression statement. But by definition it cannot begin with an opening curly brace (since it would be indistinguishable from a code block) or with the function keyword (since it would be indistinguishable from an FD).
However, in the section on error processing the standard allows implementations to extend the program syntax. One such extension can be seen in the case of functions which appear in blocks. All implementations existing today do not throw an exception in this case and process it, but each in its own way.
The presence of if-else branches assumes that a choice is being made about which of the two functions will be defined. Since this decision is made at runtime, a function expression (FE) should be used. However, the majority of implementations simply create both function declarations (FD) on entering the context stage, and since both functions use the same name, only the last declared function gets called. In this example the function foo shows 1 although the else branch never executes.
However, the SpiderMonkey implementation treats this case in two ways: on the one hand it does not consider such functions as declarations (i.e. the function is created on the condition at the code execution stage), but on the other hand they are not real function expressions, since they cannot be called without surrounding parentheses (again the parse error: "indistinguishable from an FD") and they are stored in the variable object.
My opinion is that SpiderMonkey handles this case correctly by separating out its own middle type of function (FE + FD). Such functions are created at the right time and according to conditions, but unlike FE, and more like FD, they are available to be called from the outside. SpiderMonkey calls this syntactic extension a Function Statement (in abbreviated form FS); this terminology is mentioned in MDC. JavaScript inventor Brendan Eich also noted this type of function provided by the SpiderMonkey implementation.
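Since the choice between the two functions is made at runtime, it can be expressed with function expressions, which does not rely on any implementation extension. The sketch below is one possible way to write it; the helper name pick is hypothetical.

```javascript
// Sketch: choosing a function at runtime with function expressions,
// which does not depend on how an engine treats functions in blocks.
// `pick` is a hypothetical helper that returns one of two FE.
function pick(condition) {
  var foo;

  if (condition) {
    foo = function () { return 0; };
  } else {
    foo = function () { return 1; };
  }

  return foo;
}

console.log(pick(true)());  // 0
console.log(pick(false)()); // 1
```

Only the branch that actually executes creates a function object, so the result follows the condition.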
Feature of Named Function Expression (NFE)
When an FE has a name (named function expression, in abbreviated form NFE), one important feature arises. As we know from the definition (and as we saw in the examples above), function expressions do not influence the variable object of a context (this means that it is impossible to call them by name before or after their definition). However, an FE can call itself by name in a recursive call:

    (function foo(bar) {

      if (bar) {
        return;
      }

      foo(true); // "foo" name is available

    })();

    // but from the outside, correctly, it is not

    foo(); // "foo" is not defined
Where is the name "foo" stored? In the activation object of foo? No, since nobody has defined any "foo" name inside the foo function. In the parent variable object of the context which creates foo? Also no; remember the definition: an FE does not influence the VO, which is exactly what we see when calling foo from the outside. Where then?
Here is how it works: when the interpreter meets a named FE at the code execution stage, before creating the FE it creates an auxiliary special object and adds it to the front of the current scope chain. Then it creates the FE itself, at which stage the function gets the [[Scope]] property (as we know from Chapter 4. Scope chain): the scope chain of the context which created the function (i.e. in [[Scope]] there is that special object). After that, the name of the FE is added to the special object as its only property; the value of this property is a reference to the FE. The last action is removing that special object from the parent scope chain. Let's see this algorithm in pseudo-code:

    specialObject = {};

    Scope = specialObject + Scope;

    foo = new FunctionExpression;
    foo.[[Scope]] = Scope;
    specialObject.foo = foo; // {DontDelete}, {ReadOnly}

    delete Scope[0]; // remove specialObject from the front of the scope chain
Thus, from the outside this function name is not available (since it is not present in the parent scope), but the special object has been saved in the [[Scope]] of the function, and there this name is available.
It is necessary to note, however, that some implementations, for example Rhino, save this optional name not in the special object but in the activation object of the FE. The implementation from Microsoft, JScript, completely breaks the FE rules: it keeps this name in the parent variable object, and the function becomes available outside.
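The observable consequences of this mechanism in a standard-conforming engine can be sketched as follows: the name works inside the function, does not leak outside, and cannot be reassigned. The names factorial and fact are hypothetical; the try/catch is there because an assignment to the name is silently ignored in non-strict code but throws a TypeError in strict code.

```javascript
// Sketch: the name of a named FE is visible only inside the function,
// and the binding is read-only ({ReadOnly} in the pseudo-code above).
// `factorial` and `fact` are hypothetical names.
var factorial = function fact(n) {
  try {
    fact = null; // ignored in non-strict code, TypeError in strict code
  } catch (e) {
    // strict mode reports the attempt; the binding is unchanged either way
  }
  return n <= 1 ? 1 : n * fact(n - 1);
};

console.log(factorial(5)); // 120
console.log(typeof fact);  // "undefined": the name did not leak outside
```

The recursion still works after the attempted reassignment, which shows that the binding kept its reference to the function.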
NFE and SpiderMonkey
Let's have a look at how different implementations handle this problem. Some versions of SpiderMonkey have one feature related to the special object which can be treated as a bug (although everything was implemented according to the standard, so it is more of an editorial defect of the specification). It is related to the mechanism of identifier resolution: the scope chain analysis is two-dimensional, and when resolving an identifier it considers the prototype chain of every object in the scope chain as well.
We can see this mechanism in action if we define a property in Object.prototype and use a "nonexistent" variable in the code. In the following example, when resolving the name x the global object is reached without finding x. However, since in SpiderMonkey the global object inherits from Object.prototype, the name x is resolved there:

    Object.prototype.x = 10;

    (function () {
      alert(x); // 10
    })();
Activation objects do not have prototypes. With the same starting conditions, it is possible to see the same behavior in an example with an inner function. If we define a local variable x, declare an inner function (FD or anonymous FE) and then reference x from the inner function, this variable is resolved normally in the parent function context (i.e. there, where it should be and is), instead of in Object.prototype:

    Object.prototype.x = 10;

    function foo() {

      var x = 20;

      // function declaration

      function bar() {
        alert(x);
      }

      bar(); // 20, from AO(foo)

      // the same with anonymous FE

      (function () {
        alert(x); // 20, also from AO(foo)
      })();

    }

    foo();
Some implementations set a prototype for activation objects, which is an exception compared to most other implementations. So, in the Blackberry implementation the value x from the above example is resolved to 10. I.e. the lookup does not reach the activation object of foo, since the value is found in Object.prototype:

    AO(bar FD or anonymous FE) -> no ->
    AO(bar FD or anonymous FE).[[Prototype]] -> yes - 10
And we can see absolutely the same situation in SpiderMonkey in the case of the special object of a named FE. This special object (by the standard) is a normal object, "as if by expression new Object()", and accordingly it should inherit from Object.prototype, which is exactly what can be seen in the SpiderMonkey implementation (but only up to version 1.7). Other implementations (including newer versions of SpiderMonkey) do not set a prototype for that special object:

    function foo() {

      var x = 10;

      (function bar() {

        alert(x); // 20, but not 10, as we don't reach AO(foo)

        // "x" is resolved by the chain:
        // AO(bar) - no -> __specialObject(bar) -> no
        // __specialObject(bar).[[Prototype]] - yes: 20

      })();
    }

    Object.prototype.x = 20;

    foo();
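To check which behavior a given engine has, the same experiment can be written so that it returns the resolved value instead of showing it. This is a sketch only; the property name probeValue is hypothetical and is removed from Object.prototype again after the check.

```javascript
// Sketch: in an engine that does not give the special object a prototype,
// the local variable wins over the property on Object.prototype.
// `probeValue` is a hypothetical property name, removed again afterwards.
function foo() {
  var probeValue = 10;

  return (function bar() {
    return probeValue; // resolved in AO(foo)
  })();
}

Object.prototype.probeValue = 20;
var resolved = foo();
delete Object.prototype.probeValue;

console.log(resolved); // 10 in such an engine
```

An engine with the behavior described for old SpiderMonkey versions would report 20 here instead.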
NFE and JScript
The ECMAScript implementation from Microsoft, JScript, which is currently built into Internet Explorer (up to JScript 5.8, IE8), has a number of bugs related to named function expressions (NFE). Every one of these bugs completely contradicts the ECMA-262-3 standard; some of them may cause serious errors.
First, JScript in this case breaks the main rule of FE, that they should not be stored in the variable object by function name. An optional FE name, which should be stored in the special object and be accessible only inside the function itself (and nowhere else), is here stored directly in the parent variable object. Moreover, a named FE is treated in JScript as a function declaration (FD), i.e. it is created on entering the context stage and is available before its definition in the source code:

    // FE is available in the variable object
    // via the optional name before the
    // definition, like an FD
    testNFE();

    (function testNFE() {
      alert('testNFE');
    });

    // and also after the definition,
    // like an FD; the optional name is
    // in the variable object
    testNFE();
As we see, a complete violation of the rules.
Secondly, when a named FE is assigned to a variable at declaration, JScript creates two different function objects. It is difficult to call such behavior logical (especially considering that outside of an NFE its name should not be accessible at all):

    var foo = function bar() {
      alert('foo');
    };

    alert(typeof bar); // "function", NFE again in the VO, already a mistake

    // but further it is more interesting
    alert(foo === bar); // false!

    foo.x = 10;
    alert(bar.x); // undefined

    // but both functions perform
    // the same action

    foo(); // "foo"
    bar(); // "foo"
Again we see complete disorder.
However, it is necessary to notice that if the NFE is described separately from the assignment to a variable (for example via the grouping operator), and only after that is assigned to a variable, then the equality check returns true, as if it were one object:

    (function bar() {});

    var foo = bar;

    alert(foo === bar); // true

    foo.x = 10;
    alert(bar.x); // 10
This can be explained. Actually, two objects are again created, but after that only one really remains. If we again consider that the NFE here is treated as a function declaration (FD), then on entering the context stage the FD bar is created. After that, at the code execution stage, the second object, the function expression (FE) bar, is created and is not saved anywhere. Accordingly, as there is no reference to the FE bar, it is removed. Thus there is only one object, the FD bar, a reference to which is assigned to the variable foo.
Thirdly, regarding the indirect reference to a function via arguments.callee: it references the object by whose name the function is activated (to be exact, functions, since there are two objects):

    var foo = function bar() {

      alert([
        arguments.callee === foo,
        arguments.callee === bar
      ]);

    };

    foo(); // [true, false]
    bar(); // [false, true]
Fourthly, as JScript treats an NFE as a usual FD, it is not subject to the rules of conditional operators, i.e. just like an FD, an NFE is created on entering the context and the last definition in the code is used:

    var foo = function bar() {
      alert(1);
    };

    if (false) {

      foo = function bar() {
        alert(2);
      };

    }
    bar(); // 2
    foo(); // 1
This behavior can also be "logically" explained. On entering the context stage the last met FD with the name bar is created, i.e. the function with alert(2). After that, at the code execution stage, a new function, the FE bar, is created, a reference to which is assigned to the variable foo. Thus (as the if-block further in the code, with the condition false, is unreachable), the activation of foo produces alert(1). The logic is clear, but taking IE bugs into account, I have put the word "logically" in quotes, since such an implementation is obviously broken and depends on JScript bugs.
The fifth NFE bug in JScript is related to the creation of properties of the global object by assigning a value to an unqualified identifier (i.e. without the var keyword). Since an NFE is treated here as an FD and, accordingly, stored in the variable object, an assignment to an unqualified identifier (i.e. not to a variable but to a usual property of the global object) whose name is the same as the function name does not create a global property.
    (function () {

      // without var this is not a variable in the local
      // context, but a property of the global object

      foo = function foo() {};

    })();

    // however, from the outside of the
    // anonymous function, the name foo
    // is not available

    alert(typeof foo); // undefined
Again, the "logic" is clear: the function declaration foo gets into the activation object of the local context of the anonymous function on entering the context stage. At the code execution stage the name foo already exists in the AO, i.e. it is treated as local. Accordingly, the assignment operation simply updates the property foo which already exists in the AO, instead of creating a new property of the global object as it should according to the logic of ECMA-262-3.
Functions created via Function constructor
This type of function object is discussed separately from FD and FE since it also has its own features. The main feature is that the [[Scope]] property of such functions contains only the global object:

    var x = 10;

    function foo() {

      var x = 20;
      var y = 30;

      var bar = new Function('alert(x); alert(y);');

      bar(); // 10, "y" is not defined

    }
We see that the [[Scope]] of the bar function does not contain the AO of the foo context: the variable "y" is not accessible and the variable "x" is taken from the global context. By the way, notice that the Function constructor can be used both with the new keyword and without it; in this case these variants are equivalent.
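The difference in [[Scope]] can be shown without triggering an error by asking for the type of the local name. The following is a minimal sketch; outer, hiddenLocal, viaConstructor and viaExpression are hypothetical names.

```javascript
// Sketch: a function built by the Function constructor does not see the
// locals of the function that created it. `typeof` is used so that the
// missing name yields "undefined" instead of throwing. The names
// `outer`, `hiddenLocal`, `viaConstructor` and `viaExpression` are hypothetical.
function outer() {
  var hiddenLocal = 30;

  var viaConstructor = new Function('return typeof hiddenLocal;');
  var viaExpression = function () { return typeof hiddenLocal; };

  return [viaConstructor(), viaExpression()];
}

console.log(outer()); // ["undefined", "number"]
```

The ordinary FE captures the activation object of outer in its [[Scope]], while the constructed function starts its lookup at the global object.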
The other feature of such functions is related to Equated Grammar Productions and Joined Objects. This mechanism is provided by the specification as a suggestion for optimization (however, implementations have the right not to use such an optimization). For example, if we have an array of 100 elements which is filled in a loop with functions, then an implementation can use this mechanism of joined objects. As a result, only one function object may be used for all elements of the array:

    var a = [];

    for (var k = 0; k < 100; k++) {
      a[k] = function () {}; // possibly, joined objects are used
    }
But functions created via the Function constructor are never joined:

    var a = [];

    for (var k = 0; k < 100; k++) {
      a[k] = Function(''); // always 100 different functions
    }
Another example related to joined objects:

    function foo() {

      function bar(z) {
        return z * z;
      }

      return bar;
    }

    var x = foo();
    var y = foo();
Here the implementation also has the right to join the objects x and y (and to use one object), because the functions (including their internal [[Scope]] property) are physically indistinguishable. Therefore, functions created via the Function constructor always require more memory resources.
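Whether a particular engine joins function objects can be observed by comparing identities. The sketch below only reports what the engine running it does and makes no claim about any specific implementation; fromLoop, makeBar and report are hypothetical names.

```javascript
// Sketch: observe whether an engine joins function objects by comparing
// identities. The names `fromLoop`, `makeBar` and `report` are hypothetical.
var fromLoop = [];

for (var k = 0; k < 3; k++) {
  fromLoop[k] = function () {};
}

function makeBar() {
  function bar(z) {
    return z * z;
  }
  return bar;
}

var report = {
  loopFunctionsDistinct: fromLoop[0] !== fromLoop[1],
  returnedFunctionsDistinct: makeBar() !== makeBar(),
  constructorFunctionsDistinct: Function('') !== Function('')
};

console.log(report);
```

A value of true means that the engine created separate objects for that case, i.e. no joining took place.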
Algorithm of function creation
The pseudo-code of the function creation algorithm (except the steps with joined objects) is described below. This description helps to understand in more detail which function objects exist in ECMAScript. The algorithm is identical for all function types.

    F = new NativeObject();

    // property [[Class]] is "Function"
    F.[[Class]] = "Function"

    // a prototype of a function object
    F.[[Prototype]] = Function.prototype

    // reference to the function itself
    // [[Call]] is activated by the call expression F()
    // and creates a new execution context
    F.[[Call]] = <reference to function>

    // built-in general constructor of objects
    // [[Construct]] is activated via the "new" keyword
    // and it is the one that allocates memory for new
    // objects; then it calls F.[[Call]]
    // to initialize created objects, passing the
    // newly created object as the "this" value
    F.[[Construct]] = internalConstructor

    // scope chain of the current context,
    // i.e. the context which creates function F
    F.[[Scope]] = activeContext.Scope
    // if this function is created
    // via new Function(...), then
    F.[[Scope]] = globalContext.Scope

    // number of formal parameters
    F.length = countParameters

    // a prototype of objects created by F
    __objectPrototype = new Object();
    __objectPrototype.constructor = F // {DontEnum}, is not enumerable in loops
    F.prototype = __objectPrototype

    return F
Pay attention: F.[[Prototype]] is the prototype of the function (constructor), and F.prototype is the prototype of objects created by this function (there is often a mess in terminology, and in some articles F.prototype is called the "prototype of the constructor", which is incorrect).
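The distinction between the two prototypes can be checked with standard reflection. This is a small sketch, assuming an engine that provides Object.getPrototypeOf (which is later than ECMA-262-3); Point is a hypothetical constructor used only for illustration.

```javascript
// Sketch: the two different "prototypes" of a function, observed through
// standard reflection. `Point` is a hypothetical constructor.
function Point(x, y) {
  this.x = x;
  this.y = y;
}

var p = new Point(1, 2);

var checks = {
  // F.[[Prototype]]: the prototype of the function object itself
  functionPrototype: Object.getPrototypeOf(Point) === Function.prototype,
  // F.prototype: the prototype given to objects created by the function
  instancePrototype: Object.getPrototypeOf(p) === Point.prototype,
  // the back reference set up at creation time
  constructorLink: Point.prototype.constructor === Point,
  // number of formal parameters
  length: Point.length
};

console.log(checks);
```

The first three checks are true and length is 2, matching the steps of the creation algorithm above.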
Conclusion
This article has turned out rather big; however, we will mention functions again when we discuss their work as constructors in one of the following chapters about objects and prototypes. As always, I am glad to answer your questions in the comments.
Additional literature
ECMA-262-3 specification:
- 13: Function Definition;
- 15.3: Function Objects.